Malaria Detection¶

Problem Definition¶

The context: Why is this problem important to solve?
The objectives: What is the intended goal?
The key questions: What are the key questions that need to be answered?
The problem formulation: What is it that we are trying to solve using data science?

Data Description ¶

There are a total of 24,958 train and 2,600 test images (colored) that we have taken from microscopic images. These images are of the following categories:

Parasitized: The parasitized cells contain the Plasmodium parasite which causes malaria
Uninfected: The uninfected cells are free of the Plasmodium parasites

Executive Summary:¶

Key Findings:

  • This comprehensive analysis reveals that Convolutional Neural Networks (CNNs) are highly efficient in discerning parasitized from uninfected blood cell images. The process leverages CNNs' intrinsic capability for feature extraction and classification, directly from the raw pixel data of blood smears—a critical step in diagnosing malaria.

Model Specifications:

  • The final proposed model is a CNN with strategic layering of convolutional layers activated by ReLU functions, pooling layers to distill features, dropout layers to mitigate overfitting, and dense layers for the final classification.
  • Structured sequentially, the model is specialized for 2D image data processing, targeting the detection of malaria parasites in blood cells with remarkable accuracy.

Model Performance:

  • The model's performance is underpinned by stellar accuracy, precision, recall, and F1-scores. These metrics are testament to its reliability and potential as a diagnostic tool in healthcare, providing a swift and accurate assessment of malaria infections.
  • Within this specific context, we must focus on minimizing false negatives, which means prioritizing recall for the positive class ('1' for "parasitized").
  • therefore the "best" model will be largely determined based on its performance on the test data, particularly its recall for the positive class.
    • Given the high cost of false negatives in medical diagnostics, a model that might have a slightly lower accuracy but higher recall could be preferable.

Problem and Solution Summary:

  • Malaria diagnosis through microscopic blood smear examination is time-consuming and requires specialized skills. Our solution, a CNN-based model trained on the NIH Malaria Dataset, automates this task. It stands out for its robust preprocessing methods like normalization and resizing, ensuring optimal performance.
  • Malaria is a significant burden on the healthcare systems of affected regions and it is the major cause of death in many developing countries. Early testing is necessary to detect malaria and save lives. Thus, it gives us the motivation to make malaria diagnosis more effective and faster. A specialized technology proves essential to combat this problem
  • The problem at hand is the need for an accurate, reliable, and efficient method of diagnosing malaria from blood cell images. Malaria diagnosis currently relies heavily on manual microscopy, which is time-consuming and requires significant expertise.
  • The proposed CNN model automates the detection process, reducing the need for expert intervention and speeding up diagnosis. This solution can scale the screening process, allowing for more widespread and rapid identification of malaria cases, which is critical in managing and treating the disease, especially in resource-constrained environments.

Mount the Drive

In [ ]:
# Mounting the drive
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive

Loading libraries¶

In [ ]:
import os #to create data paths
import random #to randomly select datapoints
import numpy as np                                                                # Importing numpy for Matrix Operations
import pandas as pd                                                               # Importing pandas to read CSV files
import matplotlib.pyplot as plt                                                   # Importting matplotlib for Plotting and visualizing images
import math                                                                       # Importing math module to perform mathematical operations
import cv2
from PIL import Image                                                                       # Importing openCV for image processing
import seaborn as sns                                                            # Importing seaborn to plot graphs


# Tensorflow modules
import tensorflow as tf
from tensorflow import keras
from keras.models import Model
from keras.layers import Dense
from keras.preprocessing.image import ImageDataGenerator               # Importing the ImageDataGenerator for data augmentation
from keras.models import Sequential, Model                             # Importing the sequential module to define a sequential model
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, BatchNormalization, Dropout, LeakyReLU
from keras.applications.vgg16 import VGG16
from keras.utils import img_to_array, load_img, to_categorical
from keras.callbacks import EarlyStopping, ModelCheckpoint

from keras import backend
from random import shuffle
                                                                                  # Defining all the layers to build our CNN Model
from keras.optimizers import Adam, SGD                                 # Importing the optimizers which can be used in our model
from sklearn import preprocessing                                                 # Importing the preprocessing module to preprocess the data
from sklearn.model_selection import train_test_split                              # Importing train_test_split function to split the data into train and test
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix                                      # Importing confusion_matrix to plot the confusion matrix

# Display images using OpenCV
from google.colab.patches import cv2_imshow                                                      # Importing cv2_imshow from google.patches to display images

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')

Let us load the data¶

Note:

  • You must download the dataset from the link provided on Olympus and upload the same to your Google Drive. Then unzip the folder.
In [ ]:
#Storing the path of the data file from the Google drive
path = '/content/drive/MyDrive/Data Science MIT/Projects/Capstone projects/Deep Learning/Malaria Detection/Notebooks/cell_images.zip'

import zipfile as zp

#The data is provided as a zip file so we need to extract the files from the zip file
with zp.ZipFile(path, 'r') as zip_ref:
    zip_ref.extractall()

The extracted folder has different folders for train and test data will contain the different sizes of images for parasitized and uninfected cells within the respective folder name.

The size of all images must be the same and should be converted to 4D arrays so that they can be used as an input for the convolutional neural network. Also, we need to create the labels for both types of images to be able to train and test the model.

Let's do the same for the training data first and then we will use the same code for the test data as well.

TRAIN:

In [ ]:
#Storing the path of the extracted "train" folder
train_dir = '/content/cell_images/train'

#Size of image so that each image has the same size
SIZE = 64

#Empty list to store the training images after they are converted to NumPy arrays
train_images = []

#Empty list to store the training labels (0 - uninfected, 1 - parasitized)
train_labels = []
In [ ]:
#We will run the same code for "parasitized" as well as "uninfected" folders within the "train" folder
for folder_name in ['/parasitized/', '/uninfected/']:

    #Path of the folder
    images_path = os.listdir(train_dir + folder_name)

    for i, image_name in enumerate(images_path):
        try:
            #Opening each image using the path of that image
            image = Image.open(train_dir + folder_name + image_name)

            #Resizing each image to (64,64)
            image = image.resize((SIZE, SIZE))

            #Converting images to arrays and appending that array to the empty list defined above
            train_images.append(np.array(image))

            #Creating labels for parasitized and uninfected images
            if folder_name=='/parasitized/':
                train_labels.append(1)
            else:
                train_labels.append(0)
        except Exception:
            pass

#Converting lists to arrays
train_images = np.array(train_images)
train_labels = np.array(train_labels)
In [ ]:
print(f"Shape of test images: {train_images.shape}")
print(f"Shape of test labels: {train_labels.shape}")
Shape of test images: (24958, 64, 64, 3)
Shape of test labels: (24958,)

TEST:

In [ ]:
#Storing the path of the extracted "test" folder
test_dir = '/content/cell_images/test'

#Size of image so that each image has the same size (it must be same as the train image size)
SIZE = 64

#Empty list to store the testing images after they are converted to NumPy arrays
test_images = []

#Empty list to store the testing labels (0 - uninfected, 1 - parasitized)
test_labels = []
In [ ]:
#We will run the same code for "parasitized" as well as "uninfected" folders within the "test" folder
for folder_name in ['/parasitized/', '/uninfected/']:

    #Path of the folder
    images_path = os.listdir(test_dir + folder_name)

    for i, image_name in enumerate(images_path):
        try:
            #Opening each image using the path of that image
            image = Image.open(test_dir + folder_name + image_name)

            #Resizing each image to (64,64)
            image = image.resize((SIZE, SIZE))

            #Converting images to arrays and appending that array to the empty list defined above
            test_images.append(np.array(image))

            #Creating labels for parasitized and uninfected images
            if folder_name=='/parasitized/':
                test_labels.append(1)
            else:
                test_labels.append(0)
        except Exception:
            pass

#Converting lists to arrays
test_images = np.array(test_images)
test_labels = np.array(test_labels)

Check the shape of train and test images

In [ ]:
print(f"Shape of train images: {train_images.shape}")
print("")
print(f"Shape of test images: {test_images.shape}")
Shape of train images: (24958, 64, 64, 3)

Shape of test images: (2600, 64, 64, 3)

Check the shape of train and test labels

In [ ]:
print(f"Shape of train labels: {train_labels.shape}")
print("")
print(f"Shape of test labels: {test_labels.shape}")
Shape of train labels: (24958,)

Shape of test labels: (2600,)
In [ ]:
train_images[0] #display 3-dimensional NumPy representation of the first image in the training data
Out[ ]:
array([[[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       ...,

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]]], dtype=uint8)
In [ ]:
train_labels[0] #display the label of the first image in the training data
Out[ ]:
1

Observations and insights

  • the training and test image datasets are stored as a 4D Numpy array
    • dimension1: the number of images
      • train: 24,958 images.
      • test: 2,600 images
    • dimension 2 & 3: width and height of image
      • each image in both the test and training set are the same: 64x64 pixels.
    • dimension 4: total number of channels in each image
      • the images are color since they have 3 channels each (i.e. Each pixel in the image has 3 values - the intensity of R, G, and B channels)
  • looking at the first image of the training set, we see how each image is represented by 64 arrays, each of shape 64x3

The aforementioned dimensions indicate that we have a reasonably large dataset.

  • This will be beneficial for training deep learning models

Each image in the training and test datasets is assigned a value of either 1 or 0 using label encoding

  • 1 assigned if 'has malaria' -- paracitized
  • 0 assigned if 'healthy' (no malaria) -- unaffected

Check the minimum and maximum range of pixel values for train and test images

In [ ]:
print(f"Train images - Min: {train_images.min()}, Max: {train_images.max()}")
print(f"Test images - Min: {test_images.min()}, Max: {test_images.max()}")
Train images - Min: 0, Max: 255
Test images - Min: 0, Max: 255

Observations and insights:

  • Pixel intensity values range from 0 to 255, which is typical for color images in the RGB color space.

Note: Normalizing these values to the range 0-1 can help improve the training process as it ensures that the model's input features are on a similar scale.

  • therefore we will normalize the pixel intensity values next

Count the number of values in both uninfected and parasitized

In [ ]:
print(f"Number of training uninfected images: {np.sum(train_labels == 0)}")
print(f"Number of training parasitized images: {np.sum(train_labels == 1)}")
print(f"Number of test uninfected images: {np.sum(test_labels == 0)}")
print(f"Number of test parasitized images: {np.sum(test_labels == 1)}")
Number of training uninfected images: 12376
Number of training parasitized images: 12582
Number of test uninfected images: 1300
Number of test parasitized images: 1300

Normalize the images

In [ ]:
#Normalizing the images by scaling pixel intensities to the range 0-1
train_images_normalized = train_images.astype('float32') / 255.0 #dividing the pixel values by 255 (the maximum RGB value).
test_images_normalized = test_images.astype('float32') / 255.0
In [ ]:
# Function to visualize a few images from an array
def visualize_sample_images(images, title, n=5):
    plt.figure(figsize=(10, 2))
    for i in range(n):
        plt.subplot(1, n, i+1)
        plt.imshow(images[i])
        plt.title(title)
        plt.axis('off')
    plt.show()

visualize_sample_images(train_images_normalized[:5], 'Train Normalized')
visualize_sample_images(test_images_normalized[:5], 'Test Normalized')

Why Normalize?

  • stabalize training
  • because the data comes from various healcare facilities (presumably), normalizing the training images will lead to improved model reliability.
    • this preprocessing step is crucial in medical imaging to compensate for color variations that may arise from differences in lab sample preparation (e.g. staining procedures).
      • Stain normalization serves as a preprocessing step that aligns the color characteristics of tissue samples across varying conditions and inputs. By standardizing the appearance of images, it ensures that the model's performance is not hindered by irrelevant variations, allowing it to focus on the true diagnostic markers present in the tissue.
    • Similarly, it reduces the liklihood of spurious correlations arising from biased data (e.g. if all hospitals used different markers but a large proportion of cell images come from a particular hospital, the model may learn the features of these images more closely).
    • normalization reduces the liklihood of the model inadvertently learning to recognize hopital-specific tokens or markers instead of actual indicators of malaria. spurious correlations lead to flawed conclusions and unreliable prediction models with poor generalization abilities.
    • Such inconsistencies can inadvertently lead to complex, error-prone models as the algorithm may incorrectly learn from the color variability and marker variablilty instead of the actual features of interest.
  • Implementing stain normalization enhances the generalizability and robustness of computer-assisted diagnostic tools. It ensures that the algorithm's accuracy in detecting anomalies, like parasitized cells in the case of malaria, is not compromised by external factors unrelated to the actual pathology of the samples.
In [ ]:
# Print the shapes of the train and test datasets
print('\nshape of Training set:', train_images_normalized.shape, train_labels.shape)
print('New shape of Test set:', test_images_normalized.shape, test_labels.shape)
shape of Training set: (24958, 64, 64, 3) (24958,)
New shape of Test set: (2600, 64, 64, 3) (2600,)

Observations and insights:

  • by counting the number of train and test images and labels, it reveals that the dataset is fairly balanced
    • there is near balance between the two classes (uninfected and parasitized) in both the training and test sets, with each class having approximately 12,000 images in the training set and 1,300 in the test set.
    • this is good because we dont have to do any more preprocessing as the improvement will be marginal
  • Normalization of the pixels intensities is an important step in preprocessing that can lead to faster convergence during training.

Plot to check if the data is balanced

In [ ]:
plt.figure(figsize=(6, 4))
sns.barplot(x=['Train Uninfected', 'Train Parasitized', 'Test Uninfected', 'Test Parasitized'],
            y=[np.sum(train_labels == 0), np.sum(train_labels == 1),
               np.sum(test_labels == 0), np.sum(test_labels == 1)])
plt.title('Data Distribution')
plt.show()

Observations and insights:

  • this bar graph further emphasizes that there is near balance or equal distribution between the two classes
  • this is ideal during model training because these balanced classes will help prevent the model from being biased toward learning the features of the most frequent class

CNNs for the purpose of disease detection in the healthcare domain rely on the quality and diversity of the underlying data, as well as the balance across classes.

Data Preprocessing steps:¶

  • The input image is first processed to remove unwanted noise from the RGB cell image.
    • CNN is a model which is designed to process arrays of data such as images. The first we took when loading in the data was resizing all images because a CNN cannot train images of different sizes.
    • normalization is also a necessary pre-processing step to improve model reliability
      • this is an important step, especially in high stakes domains, such as healthcare, where the consequences of misclassification can have serious consequences. More specifically, the consequences of a false negative malaria test may lead to death that could otherwise have been avoided with accurate early diagnosis
  • We then feed the image as an input to the feature extraction stage, where the output will be the feature vectors. The next stage is the classification stage, where the input will be the feature vectors, and output is the classified label as parasitic and non-parasitic.

Data Exploration¶

Let's visualize the images from the train data

In [ ]:
def visualize_data(images, labels, title, n=5):
    plt.figure(figsize=(10, 2))
    for i in range(n):
        plt.subplot(1, n, i+1)
        idx = np.random.choice(np.where(labels == 1)[0]) if title == "Parasitized" else np.random.choice(np.where(labels == 0)[0])
        plt.imshow(images[idx])
        plt.title(title)
        plt.axis('off')
    plt.show()

visualize_data(train_images, train_labels, "Parasitized")
visualize_data(train_images, train_labels, "Uninfected")

Observations and insights:

The provided visualization showcases samples of parasitized and uninfected blood cell images which are helpful in understanding the visual differences between the two classes:

Patterns in Parasitized Cells:

  • Presence of Parasites: The most striking feature in the parasitized cells is the presence of purple or blue inclusions, which are likely the malaria parasites themselves. These are absent in the uninfected cells.
  • Color Variation: Parasitized cells show more variation in color within the cell. The color ranges from a light pink to a deep purple, indicative of the infection.
  • Irregular Shapes: The shape of the inclusions within the parasitized cells is irregular, which contrasts with the uniformity of the uninfected cells.

Patterns in Uninfected Cells:

  • Uniform Coloration: Uninfected cells tend to have a more uniform color distribution, primarily in shades of pale pink or light blue, without the distinct inclusions seen in parasitized cells.
  • Regular Shape: The uninfected cells are mostly circular with a clear and consistent cell membrane, lacking the irregular features of the parasitized cells.

These distinct differential visual features should be captured well by CNNs

  • CNNs can learn to recognize patterns in color, shape, and texture that are indicative of parasitic infection.

comparing differential visual features of the 2 classes will help in informing the design of the CNN layers and the choice of filters to effectively capture the characteristics that differentiate parasitized from uninfected cells. For example:

  • Inclusions: The presence or absence of inclusions is a primary visual cue. The malaria parasites appear as distinct entities within the cells and can be used as a primary feature for classification.
  • Color Intensity and Distribution: Parasitized cells exhibit a range of color intensities, while uninfected cells are more homogenous. A model can be trained to pick up on these subtleties.
  • Texture and Complexity: The texture within parasitized cells is more complex due to the presence of the parasite. Machine learning models can be trained to differentiate between the smooth texture of uninfected cells and the complex texture of parasitized ones.

Visualize the images with subplot(6, 6) and figsize = (12, 12)

In [ ]:
def visualize_data_large(images, labels, title, n=6):
    plt.figure(figsize=(12, 12))
    for i in range(n*n):
        plt.subplot(n, n, i+1)
        idx = np.random.choice(np.where(labels == 1)[0]) if title == "Parasitized" else np.random.choice(np.where(labels == 0)[0])
        plt.imshow(images[idx])
        plt.title(title)
        plt.axis('off')
    plt.tight_layout()
    plt.show()

visualize_data_large(train_images, train_labels, "Parasitized", 6)
visualize_data_large(train_images, train_labels, "Uninfected", 6)

Observations and insights:

  • some cells that appear to be parasitized are labeled 'uninfected' and vice versa
  • in this context, we are most concerned about infected cells being incorrectly labeled as 'uninfected' because the consequences of not detecting the disease when it is present outweigh the negative consequences of treating a person a person that does not actually have malaria

this finding raises questions about the integrity of the training set

  • this is an important consideration because such misclassification in the training set puts inherent limitations on the accuracy of the prediction models
  • in this context, false negative diagnosis is what we want to avoid
  • therefore, it would be a good exercise to eliminate any false negatives from the training data before building CNN models

Plotting the mean images for parasitized and uninfected

In [ ]:
avg_parasitized = train_images[train_labels == 1].mean(axis=0)
avg_uninfected = train_images[train_labels == 0].mean(axis=0)

Mean image for parasitized vs mean image for Uninfected:

In [ ]:
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.imshow(avg_parasitized.astype('uint8'))
plt.title('Average Parasitized Cell')
plt.axis('off')

plt.subplot(1, 2, 2)
plt.imshow(avg_uninfected.astype('uint8'))
plt.title('Average Uninfected Cell')
plt.axis('off')
plt.show()

Observations and insights:

  • these visualizations are generated by averaging the pixel intensities across many images in each class and can provide a generalized view of the distinguishing features between different classes.

Comparison between Average Parasitized Cell and Average Uninfected Cell:

  • Both images have a brighter center, which is typical for cellular images due to the lighting conditions under which the samples are imaged.
    • However, the average parasitized cell appears to have a more defined and possibly darker central region than the uninfected cell, which might be indicative of the presence of the parasite.
  • The edges of the average parasitized cell seem to be darker and more distinct than those of the uninfected cell.
    • This could suggest that parasitized cells have a different edge profile, which could be a feature learned by the classification model.
    • The average parasitized cell has a darker color overall color
      • this makes sense because a feature of the paracitized cell is the presence of a dark parasite within the cell

Note: Overall, the analysis of average images is a valuable part of exploratory data analysis, offering a visual summary that can support various stages of model development and provide insights into the data that may not be immediately apparent from individual images.

Additional EDA:

Analyze the color intensity distribution of the images

In [ ]:
def plot_color_distribution(images, labels, title):
    if title == 'Parasitized':
        mask = labels == 1
    else:
        mask = labels == 0
    color_data = images[mask].mean(axis=(1,2))
    sns.histplot(color_data, bins=30, kde=True)
    plt.title(f'Color Intensity Distribution in {title} Images')
    plt.xlabel('Pixel Intensity')
    plt.ylabel('Frequency')
    plt.show()

plot_color_distribution(train_images, train_labels, "Parasitized")
plot_color_distribution(train_images, train_labels, "Uninfected")

Observations: Color Intensity Distribution Analysis

  • parasitized images have a consistently higher intensity of blue compared to uninfected cells
  • the blue channel has a distinct distribution pattern that differs between between parasitized and uninfected images
    • this observation is useful because it may suggest that the blue channel might be more significant for classification (ML models can potentially learn to focus on these differences to improve classification accuracy)
  • in both classes, it is evident that the green and red channel consistently overlap over the entire distribution
    • such a degree of overlapping color intensities suggest that analysis of the pixel intensity of the red and green channels as a means of distinguishing between parasitized and uninfected cells may not be useful
    • The degree of overlap between the distributions of the two classes indicates how challenging it might be to distinguish between them based on color intensity alone and other features might need to be considered.

Converting RGB to HSV of Images using OpenCV

Converting the train data and test

In [ ]:
def convert_to_hsv(images):
    hsv_images = []
    for img in images:
        hsv_img = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
        hsv_images.append(hsv_img)
    return np.array(hsv_images)

# Convert train and test images from RGB to HSV
train_images_hsv = convert_to_hsv(train_images)
test_images_hsv = convert_to_hsv(test_images)

Check the shape of converted arrays

In [ ]:
print(f"Shape of HSV train images: {train_images_hsv.shape}")
print(f"Shape of HSV test images: {test_images_hsv.shape}")
Shape of HSV train images: (24958, 64, 64, 3)
Shape of HSV test images: (2600, 64, 64, 3)

Processing Images using Gaussian Blurring

Gaussian Blurring on train data and test data

In [ ]:
def apply_gaussian_blur(images, kernel_size=(5, 5)):
    blurred_images = []
    for img in images:
        blurred_img = cv2.GaussianBlur(img, kernel_size, 0)
        blurred_images.append(blurred_img)
    return np.array(blurred_images)

# Apply Gaussian blurring to train and test images
train_images_blurred = apply_gaussian_blur(train_images)
test_images_blurred = apply_gaussian_blur(test_images)

check the shape of blurred image arrays

In [ ]:
#check the shape of the blurred image arrays
print(f"Shape of blurred train images: {train_images_blurred.shape}")
print(f"Shape of blurred test images: {test_images_blurred.shape}")
Shape of blurred train images: (24958, 64, 64, 3)
Shape of blurred test images: (2600, 64, 64, 3)

Additional Data Processing steps¶

Visulaizing some of the HSV and blurred images

In [ ]:
# Function to visualize a few images from an array
def visualize_sample_images(images, title, n=5):
    plt.figure(figsize=(10, 2))
    for i in range(n):
        plt.subplot(1, n, i+1)
        plt.imshow(images[i])
        plt.title(title)
        plt.axis('off')
    plt.show()

visualize_sample_images(train_images_hsv[:5], 'HSV Images')
visualize_sample_images(train_images_blurred[:5], 'Blurred Images')

Blurring

  • this might be a useful preprocessing step to reduce noise in an image by reducing detail
  • it can help the model focus on more significant patterns rather than small details or noise.
In [ ]:
# Check if labels need encoding
print(f"Unique labels in training set: {np.unique(train_labels)}")
print(f"Unique labels in test set: {np.unique(train_labels)}")
Unique labels in training set: [0 1]
Unique labels in test set: [0 1]

We need to do one-hot encoding here because each outcome is currently a vector of dimension 2.

One Hot Encoding the train and test labels

In [ ]:
train_labels_encoded = to_categorical(train_labels)
test_labels_encoded = to_categorical(test_labels)

Building the Models¶

Base Model¶

Note: The Base Model has been fully built and evaluated with all outputs shown to give an idea about the process of the creation and evaluation of the performance of a CNN architecture. A similar process can be followed in iterating to build better-performing CNN architectures.

In [ ]:
# Fixing the seed for random number generators
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)

Importing the required libraries for building and training our Model

In [ ]:
#libraries have all been imported

Building the model

In [ ]:
model = Sequential()

model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(SIZE, SIZE, 3), padding = "same"))
model.add(MaxPooling2D((2, 2)))

model.add(Conv2D(64, (3, 3), activation='relu', padding = "same"))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.1))

model.add(Conv2D(32, (3, 3), activation='relu', padding = "same"))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.1))

model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))  # 2 because of two classes

model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 64, 64, 32)        896       
                                                                 
 max_pooling2d (MaxPooling2  (None, 32, 32, 32)        0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 32, 32, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 16, 16, 64)        0         
 g2D)                                                            
                                                                 
 dropout_2 (Dropout)         (None, 16, 16, 64)        0         
                                                                 
 conv2d_2 (Conv2D)           (None, 16, 16, 32)        18464     
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 8, 8, 32)          0         
 g2D)                                                            
                                                                 
 dropout_3 (Dropout)         (None, 8, 8, 32)          0         
                                                                 
 flatten_1 (Flatten)         (None, 2048)              0         
                                                                 
 dense_3 (Dense)             (None, 256)               524544    
                                                                 
 dropout_4 (Dropout)         (None, 256)               0         
                                                                 
 dense_4 (Dense)             (None, 2)                 514       
                                                                 
=================================================================
Total params: 562914 (2.15 MB)
Trainable params: 562914 (2.15 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

Compiling the model

In [ ]:
model.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])

Using Callbacks

In [ ]:
early_stopping = EarlyStopping(monitor='val_loss', patience=5)
model_checkpoint = ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True)

Fit and train our Model

In [ ]:
history = model.fit(train_images_normalized, train_labels_encoded, epochs=20, batch_size=32, callbacks=[early_stopping, model_checkpoint], validation_split = 0.2, verbose = 1)
Epoch 1/20
624/624 [==============================] - 45s 69ms/step - loss: 0.2339 - accuracy: 0.9016 - val_loss: 0.1241 - val_accuracy: 0.9862
Epoch 2/20
624/624 [==============================] - 41s 66ms/step - loss: 0.0708 - accuracy: 0.9762 - val_loss: 0.0555 - val_accuracy: 0.9876
Epoch 3/20
624/624 [==============================] - 42s 67ms/step - loss: 0.0621 - accuracy: 0.9791 - val_loss: 0.0645 - val_accuracy: 0.9824
Epoch 4/20
624/624 [==============================] - 41s 66ms/step - loss: 0.0554 - accuracy: 0.9808 - val_loss: 0.0490 - val_accuracy: 0.9838
Epoch 5/20
624/624 [==============================] - 41s 65ms/step - loss: 0.0553 - accuracy: 0.9805 - val_loss: 0.0503 - val_accuracy: 0.9848
Epoch 6/20
624/624 [==============================] - 41s 66ms/step - loss: 0.0499 - accuracy: 0.9822 - val_loss: 0.0836 - val_accuracy: 0.9750
Epoch 7/20
624/624 [==============================] - 42s 67ms/step - loss: 0.0467 - accuracy: 0.9831 - val_loss: 0.1136 - val_accuracy: 0.9603
Epoch 8/20
624/624 [==============================] - 42s 67ms/step - loss: 0.0457 - accuracy: 0.9830 - val_loss: 0.0522 - val_accuracy: 0.9852
Epoch 9/20
624/624 [==============================] - 41s 66ms/step - loss: 0.0463 - accuracy: 0.9833 - val_loss: 0.0665 - val_accuracy: 0.9798

Evaluating the model on test data

In [ ]:
# Load the best model saved by ModelCheckpoint
model = keras.models.load_model('best_model.h5')

# Evaluate the model on test data
test_loss, test_accuracy = model.evaluate(test_images_normalized, test_labels_encoded)
print(f"Test Loss: {test_loss}, Test Accuracy: {test_accuracy}")
82/82 [==============================] - 2s 17ms/step - loss: 0.0433 - accuracy: 0.9842
Test Loss: 0.04329649731516838, Test Accuracy: 0.9842307567596436

Plotting the confusion matrix and classification report

In [ ]:
# Predictions
predictions = model.predict(test_images_normalized)
predictions = np.argmax(predictions, axis=1) # Convert one-hot to index
test_labels_decoded = np.argmax(test_labels_encoded, axis=1) # Convert one-hot to index

#Print results using a classification report
print(classification_report(test_labels_decoded, predictions))

# Plotting the Confusion Matrix using confusion matrix() function which is also predefined tensorflow module
confusion_matrix = tf.math.confusion_matrix(test_labels_decoded, predictions)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)

# Setting the labels to both the axes
ax.set_xlabel('Predicted labels');ax.set_ylabel('True labels');
ax.set_title('Confusion Matrix');
ax.xaxis.set_ticklabels(['uninfected', 'Paracitized'])
ax.yaxis.set_ticklabels(['uninfected', 'Paracitized'])
plt.show()

##plt.xlabel('Predicted')
##plt.ylabel('True')
##plt.show()
82/82 [==============================] - 1s 17ms/step
              precision    recall  f1-score   support

           0       0.98      0.99      0.98      1300
           1       0.99      0.98      0.98      1300

    accuracy                           0.98      2600
   macro avg       0.98      0.98      0.98      2600
weighted avg       0.98      0.98      0.98      2600

Plotting the train and validation curves

In [ ]:
#plotting the training and validation accuracy

# Extracting accuracy and validation accuracy from the cnn_history_1 object
accuracy = history.history['accuracy']
val_accuracy = history.history['val_accuracy']

# Generating a range for epochs starting from 1 to the length of the accuracy list
epochs = range(1, len(accuracy) + 1)

# Creating a plot
plt.figure(figsize=(8, 8))

# Plotting both the training accuracy and validation accuracy
plt.plot(epochs, accuracy, 'b--', label='Training Accuracy')  # Blue dashed line for training accuracy
plt.plot(epochs, val_accuracy, 'r--', label='Validation Accuracy')  # Red dashed line for validation accuracy

# Adding labels and title for clarity
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')

# Showing the legend
plt.legend(['Train', 'Validation'], loc='upper left')


# Displaying the plot
plt.show()

Overall Observations: Base Model ("model")

  • The model has a reasonable number of parameters (over 2 million).

    • This suggests the model is complex enough to learn from the data but not too large for the given task.
  • the dropout layers after convolution layers were incorporated in the base model in attempt to avoid the overfitting problem

    • generally good practice in deep learning models.
  • The validation set is used during the training process to evaluate the model's performance on unseen data, which helps in tuning hyperparameters and avoiding overfitting.

  • The model achieves high training accuracy, indicating it has learned the training data well.

  • Validation accuracy is also high and follows the training accuracy closely, which is a positive sign that the model is generalizing well to unseen data.

  • in addition, because the model's performance on the validation set is less than the test set, it indicates that the model is not overfitting

  • The test accuracy is approximately 98%, which is excellent and indicates the model has generalized well from the training data to the test data.

  • However, we need to look at recall for the 'parasitized' class to assess the model's performance in terms of minimizing false negatives.

    • the recall for class 1 is 0.98 which is very good.
  • The classification report shows high precision, recall, and F1-score for both classes, which is exceptional.

  • Given the high recall for both classes (0.98 for 'parasitized'), the model appears to be successful in minimizing false negatives, which is crucial for medical diagnostics.

Given the high performance of the base model, there may not be much room for improvement. However, in a medical context, even small improvements in recall can be significant.

Here are some suggestions:

  • Fine-Tuning: Experiment with fine-tuning the dropout rates and adding more convolutional layers to see if there's any gain in performance.
  • Data Augmentation:Further augment the training data to improve the model's ability to generalize, especially if more data cannot be obtained.

So now let's try to build another model with few more add on layers and try to check if we can try to improve the model. Therefore try to build a model by adding few layers if required and altering the activation functions.

Model 1

Trying to improve the performance of our model by adding new layers

In [ ]:
backend.clear_session() # Clearing the backend for new model

np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)

Building the Model

In [ ]:
Model1 = Sequential()

Model1.add(Conv2D(32, (3, 3), activation='relu', input_shape=(SIZE, SIZE, 3), padding="same"))
Model1.add(MaxPooling2D((2, 2)))
Model1.add(Dropout(0.2))


Model1.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
Model1.add(MaxPooling2D((2, 2)))
Model1.add(Dropout(0.3))


Model1.add(Conv2D(128, (3, 3), activation='relu', padding="same"))
Model1.add(MaxPooling2D((2, 2)))
Model1.add(Dropout(0.4))

# Added Fourth Convolutional Block
Model1.add(Conv2D(256, (3, 3), activation='relu', padding="same"))
Model1.add(MaxPooling2D((2, 2)))
Model1.add(Dropout(0.5))

# Flattening and Fully Connected Layers
Model1.add(Flatten())
Model1.add(Dense(128, activation='relu'))
Model1.add(Dropout(0.6))
Model1.add(Dense(2, activation='softmax'))  # 2 for two classes

# Display the model's architecture
Model1.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 64, 64, 32)        896       
                                                                 
 max_pooling2d (MaxPooling2  (None, 32, 32, 32)        0         
 D)                                                              
                                                                 
 dropout (Dropout)           (None, 32, 32, 32)        0         
                                                                 
 conv2d_1 (Conv2D)           (None, 32, 32, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 16, 16, 64)        0         
 g2D)                                                            
                                                                 
 dropout_1 (Dropout)         (None, 16, 16, 64)        0         
                                                                 
 conv2d_2 (Conv2D)           (None, 16, 16, 128)       73856     
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 8, 8, 128)         0         
 g2D)                                                            
                                                                 
 dropout_2 (Dropout)         (None, 8, 8, 128)         0         
                                                                 
 conv2d_3 (Conv2D)           (None, 8, 8, 256)         295168    
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 4, 4, 256)         0         
 g2D)                                                            
                                                                 
 dropout_3 (Dropout)         (None, 4, 4, 256)         0         
                                                                 
 flatten (Flatten)           (None, 4096)              0         
                                                                 
 dense (Dense)               (None, 128)               524416    
                                                                 
 dropout_4 (Dropout)         (None, 128)               0         
                                                                 
 dense_1 (Dense)             (None, 2)                 258       
                                                                 
=================================================================
Total params: 913090 (3.48 MB)
Trainable params: 913090 (3.48 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

Compiling the model

In [ ]:
Model1.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])

Using Callbacks

In [ ]:
early_stopping = EarlyStopping(monitor='val_loss', patience=5)
model_checkpoint = ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True)

Fit and Train the model

In [ ]:
history1 = Model1.fit(train_images_normalized, train_labels_encoded, epochs=20, batch_size=32, callbacks=[early_stopping, model_checkpoint], validation_split = 0.2, verbose = 1)
Epoch 1/20
624/624 [==============================] - 64s 101ms/step - loss: 0.2405 - accuracy: 0.8947 - val_loss: 0.0885 - val_accuracy: 0.9846
Epoch 2/20
624/624 [==============================] - 62s 99ms/step - loss: 0.0839 - accuracy: 0.9738 - val_loss: 0.0739 - val_accuracy: 0.9792
Epoch 3/20
624/624 [==============================] - 62s 99ms/step - loss: 0.0805 - accuracy: 0.9754 - val_loss: 0.0485 - val_accuracy: 0.9836
Epoch 4/20
624/624 [==============================] - 62s 100ms/step - loss: 0.0758 - accuracy: 0.9752 - val_loss: 0.0681 - val_accuracy: 0.9742
Epoch 5/20
624/624 [==============================] - 63s 102ms/step - loss: 0.0710 - accuracy: 0.9777 - val_loss: 0.0921 - val_accuracy: 0.9728
Epoch 6/20
624/624 [==============================] - 70s 112ms/step - loss: 0.0714 - accuracy: 0.9765 - val_loss: 0.0516 - val_accuracy: 0.9832
Epoch 7/20
624/624 [==============================] - 63s 101ms/step - loss: 0.0709 - accuracy: 0.9778 - val_loss: 0.0991 - val_accuracy: 0.9613
Epoch 8/20
624/624 [==============================] - 63s 101ms/step - loss: 0.0688 - accuracy: 0.9789 - val_loss: 0.0703 - val_accuracy: 0.9736

Evaluating the model

In [ ]:
# Load the best model saved by ModelCheckpoint
Model1 = keras.models.load_model('best_model.h5')

# Evaluate the model on test data
test_loss, test_accuracy = Model1.evaluate(test_images_normalized, test_labels_encoded)
print(f"Test Loss: {test_loss}, Test Accuracy: {test_accuracy}")
82/82 [==============================] - 3s 33ms/step - loss: 0.0521 - accuracy: 0.9831
Test Loss: 0.05211582034826279, Test Accuracy: 0.9830769300460815

Plotting the confusion matrix

In [ ]:
# Predictions
predictions = Model1.predict(test_images_normalized)
predictions = np.argmax(predictions, axis=1)
test_labels_decoded = np.argmax(test_labels_encoded, axis=1)

#Print results using a classification report
print(classification_report(test_labels_decoded, predictions))

# Plotting the Confusion Matrix using confusion matrix() function which is also predefined tensorflow module
confusion_matrix = tf.math.confusion_matrix(test_labels_decoded, predictions)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()
82/82 [==============================] - 2s 27ms/step
              precision    recall  f1-score   support

           0       0.98      0.98      0.98      1300
           1       0.98      0.98      0.98      1300

    accuracy                           0.98      2600
   macro avg       0.98      0.98      0.98      2600
weighted avg       0.98      0.98      0.98      2600

Plotting the train and the validation curves

In [ ]:
#plotting the training and validation accuracy

# Extracting accuracy and validation accuracy from the history object
accuracy = history1.history['accuracy']
val_accuracy = history1.history['val_accuracy']

# Generating a range for epochs starting from 1 to the length of the accuracy list
epochs = range(1, len(accuracy) + 1)

# Creating a plot
plt.figure(figsize=(8, 8))

# Plotting both the training accuracy and validation accuracy
plt.plot(epochs, accuracy, 'b--', label='Training Accuracy')  # Blue dashed line for training accuracy
plt.plot(epochs, val_accuracy, 'r--', label='Validation Accuracy')  # Red dashed line for validation accuracy

# Adding labels and title for clarity
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')

# Showing the legend
plt.legend()

# Displaying the plot
plt.show()

Insights and Observations for Model1:

  • The model has a more complex architecture than the base model, with additional convolutional layers and corresponding dropout layers to prevent overfitting.
  • The model's parameter count is around 3.48 million, which is quite substantial, although less than the base model, and suggests the model has the capacity to learn detailed features.

Training Process:

  • The training logs show that the model's accuracy improved rapidly and then plateaued, which is typical and desirable in a training process. It starts with an accuracy of around 88% and ends at around 98% for training accuracy. The validation accuracy closely tracks the training accuracy, ending at around 98%, which indicates that the model generalizes well to unseen data. Test Performance: On the test data, the model achieved an accuracy of around 98%, which is very high and comparable to the base model.
  • the validation accuracy is slighly lower than the test accuracy, which is good because it suggests that the model is not overfitting the train data

Classification Report:

  • Precision, recall, and F1-score for both classes are very high, close to 0.98. The high recall for the positive class (0.98) is particularly important for medical applications where minimizing false negatives is crucial.

Confusion Matrix:

  • The confusion matrix indicates a small number of false negatives (around 30 out of 1300), which is crucial for medical diagnostics as it's essential to minimize the cases where the model fails to identify parasitized cells.

Training and Validation Accuracy Graph:

  • The accuracy graph shows the model converges well without signs of overfitting, as the validation curve remains close to the training curve.

Specific Observations and Model Improvement Suggestions:

  • Overfitting: There is no strong evidence of overfitting, as validation accuracy is in line with training accuracy.
  • The dropout rates used seem adequate, but could possibly be fine-tuned if overfitting becomes apparent with additional epochs.
  • Class Imbalance: Given the balanced dataset, the model likely does not suffer from class imbalance. However, in case of class imbalance in practical scenarios, techniques like class weights, more aggressive data augmentation, or oversampling of the minority class could be used.
  • Further Tuning: Given the excellent performance, further improvements may be marginal but could involve hyperparameter tuning such as adjusting the learning rate, trying different optimizers, or experimenting with different activation functions.

Model 2: LeakyRelu and Batch Normalization¶

Think about it:

Now let's build a model with LeakyRelu as the activation function

  • Can the model performance be improved if we change our activation function to LeakyRelu?
  • Can BatchNormalization improve our model?

Let us try to build a model using BatchNormalization and using LeakyRelu as our activation function.

In [ ]:
backend.clear_session() # Clearing the backend for new model

np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)

Building the Model

In [ ]:
Model2 = Sequential() #trial

# First Convolutional Block with LeakyReLU and Batch Normalization
Model2.add(Conv2D(64, (3, 3), padding="same", input_shape=(SIZE, SIZE, 3)))
Model2.add(LeakyReLU(alpha=0.1))
Model2.add(BatchNormalization())
Model2.add(MaxPooling2D((2, 2)))
Model2.add(Dropout(0.2))

# Second Convolutional Block
Model2.add(Conv2D(128, (3, 3), padding="same"))
Model2.add(LeakyReLU(alpha=0.1))
Model2.add(BatchNormalization())
Model2.add(MaxPooling2D((2, 2)))
Model2.add(Dropout(0.3))

# Third Convolutional Block
Model2.add(Conv2D(256, (3, 3), padding="same"))
Model2.add(LeakyReLU(alpha=0.1))
Model2.add(BatchNormalization())
Model2.add(MaxPooling2D((2, 2)))
Model2.add(Dropout(0.4))

# Flattening and Fully Connected Layers
Model2.add(Flatten())
Model2.add(Dense(128))
Model2.add(LeakyReLU(alpha=0.1))
Model2.add(BatchNormalization())
Model2.add(Dropout(0.6))
Model2.add(Dense(2, activation='softmax'))  # 2 for two classes

# Display the model's architecture
Model2.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 64, 64, 64)        1792      
                                                                 
 leaky_re_lu (LeakyReLU)     (None, 64, 64, 64)        0         
                                                                 
 batch_normalization (Batch  (None, 64, 64, 64)        256       
 Normalization)                                                  
                                                                 
 max_pooling2d (MaxPooling2  (None, 32, 32, 64)        0         
 D)                                                              
                                                                 
 dropout (Dropout)           (None, 32, 32, 64)        0         
                                                                 
 conv2d_1 (Conv2D)           (None, 32, 32, 128)       73856     
                                                                 
 leaky_re_lu_1 (LeakyReLU)   (None, 32, 32, 128)       0         
                                                                 
 batch_normalization_1 (Bat  (None, 32, 32, 128)       512       
 chNormalization)                                                
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 16, 16, 128)       0         
 g2D)                                                            
                                                                 
 dropout_1 (Dropout)         (None, 16, 16, 128)       0         
                                                                 
 conv2d_2 (Conv2D)           (None, 16, 16, 256)       295168    
                                                                 
 leaky_re_lu_2 (LeakyReLU)   (None, 16, 16, 256)       0         
                                                                 
 batch_normalization_2 (Bat  (None, 16, 16, 256)       1024      
 chNormalization)                                                
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 8, 8, 256)         0         
 g2D)                                                            
                                                                 
 dropout_2 (Dropout)         (None, 8, 8, 256)         0         
                                                                 
 flatten (Flatten)           (None, 16384)             0         
                                                                 
 dense (Dense)               (None, 128)               2097280   
                                                                 
 leaky_re_lu_3 (LeakyReLU)   (None, 128)               0         
                                                                 
 batch_normalization_3 (Bat  (None, 128)               512       
 chNormalization)                                                
                                                                 
 dropout_3 (Dropout)         (None, 128)               0         
                                                                 
 dense_1 (Dense)             (None, 2)                 258       
                                                                 
=================================================================
Total params: 2470658 (9.42 MB)
Trainable params: 2469506 (9.42 MB)
Non-trainable params: 1152 (4.50 KB)
_________________________________________________________________

Compiling the model

In [ ]:
Model2.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])

Using callbacks

In [ ]:
early_stopping = EarlyStopping(monitor='val_loss', patience=5)
model_checkpoint = ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True)

Fit and train the model

In [ ]:
history2 = Model2.fit(train_images_normalized, train_labels_encoded, epochs=20, batch_size=32, callbacks=[early_stopping, model_checkpoint], validation_split = 0.2, verbose = 1)
Epoch 1/20
624/624 [==============================] - 187s 296ms/step - loss: 0.4304 - accuracy: 0.8245 - val_loss: 0.0670 - val_accuracy: 0.9916
Epoch 2/20
624/624 [==============================] - 184s 294ms/step - loss: 0.1325 - accuracy: 0.9559 - val_loss: 0.0225 - val_accuracy: 0.9920
Epoch 3/20
624/624 [==============================] - 183s 293ms/step - loss: 0.0960 - accuracy: 0.9677 - val_loss: 0.0176 - val_accuracy: 0.9952
Epoch 4/20
624/624 [==============================] - 183s 293ms/step - loss: 0.0882 - accuracy: 0.9720 - val_loss: 0.0164 - val_accuracy: 0.9938
Epoch 5/20
624/624 [==============================] - 182s 292ms/step - loss: 0.0795 - accuracy: 0.9733 - val_loss: 0.0604 - val_accuracy: 0.9832
Epoch 6/20
624/624 [==============================] - 184s 295ms/step - loss: 0.0768 - accuracy: 0.9744 - val_loss: 0.0320 - val_accuracy: 0.9878
Epoch 7/20
624/624 [==============================] - 185s 297ms/step - loss: 0.0719 - accuracy: 0.9770 - val_loss: 0.0384 - val_accuracy: 0.9878
Epoch 8/20
624/624 [==============================] - 197s 315ms/step - loss: 0.0691 - accuracy: 0.9771 - val_loss: 0.0454 - val_accuracy: 0.9840
Epoch 9/20
624/624 [==============================] - 194s 311ms/step - loss: 0.0660 - accuracy: 0.9784 - val_loss: 0.0284 - val_accuracy: 0.9906

Evaluating the model

In [ ]:
# Load the best model saved by ModelCheckpoint
Model2 = keras.models.load_model('best_model.h5')

# Evaluate the model on test data
test_loss, test_accuracy = Model2.evaluate(test_images_normalized, test_labels_encoded)
print(f"Test Loss: {test_loss}, Test Accuracy: {test_accuracy}")
82/82 [==============================] - 6s 76ms/step - loss: 0.0735 - accuracy: 0.9773
Test Loss: 0.07351841777563095, Test Accuracy: 0.9773076772689819

Plotting the train and validation accuracy

In [ ]:
#plotting the training and validation accuracy

# Extracting accuracy and validation accuracy from the history object
accuracy = history2.history['accuracy']
val_accuracy = history2.history['val_accuracy']

# Generating a range for epochs starting from 1 to the length of the accuracy list
epochs = range(1, len(accuracy) + 1)

# Creating a plot
plt.figure(figsize=(8, 8))

# Plotting both the training accuracy and validation accuracy
plt.plot(epochs, accuracy, 'b--', label='Training Accuracy')  # Blue dashed line for training accuracy
plt.plot(epochs, val_accuracy, 'r--', label='Validation Accuracy')  # Red dashed line for validation accuracy

# Adding labels and title for clarity
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')

# Showing the legend
plt.legend()

# Displaying the plot
plt.show()

Generate the classification report and confusion matrix

In [ ]:
# Predictions
predictions = Model2.predict(test_images_normalized)
predictions = np.argmax(predictions, axis=1) # Convert one-hot to index
test_labels_decoded = np.argmax(test_labels_encoded, axis=1) # Convert one-hot to index

#Print results using a classification report
print(classification_report(test_labels_decoded, predictions))

# Plotting the Confusion Matrix using confusion matrix() function which is also predefined tensorflow module
confusion_matrix = tf.math.confusion_matrix(test_labels_decoded, predictions)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()
82/82 [==============================] - 6s 71ms/step
              precision    recall  f1-score   support

           0       0.96      0.99      0.98      1300
           1       0.99      0.96      0.98      1300

    accuracy                           0.98      2600
   macro avg       0.98      0.98      0.98      2600
weighted avg       0.98      0.98      0.98      2600

Model 2 Observations and insights:

Model Architecture:

  • sequential structure with increasing complexity in convolutional blocks, and each convolutional block is followed by batch normalization and dropout layers. This structure is designed to capture increasingly abstract representations of the input data.
  • The model uses LeakyReLU as the activation function, which can help to address the problem of "dying ReLUs" by allowing a small, non-zero gradient when the unit is not active.
  • A progressive increase in dropout rate with the depth of the network is employed, which is a strategy to mitigate overfitting in deeper networks.

Training Performance:

  • the training graph shows consistent improvement in accuracy over epochs with both training and validation accuracy stabilizing around 98%, which is a strong indicator of good model performance
  • The loss values decrease over time, which aligns with the improvement in accuracy
    • this suggests that the model is effectively learning from the training data.

Test Performance:

  • The model achieves a test accuracy of 97.73% and a test loss of 0.073, which are solid results and indicate that the model has generalized well from the training to the test data.
  • the validation accuracy is consistently above the test accuracy throughout the training epochs. This indicates that the model overfits to the training data

Classification Report and Confusion Matrix:

  • The precision, recall, and F1-scores are all high (around 0.98), with a particularly strong performance in recall for the positive class (0.96), which is crucial for medical applications to minimize false negatives.
    • The confusion matrix reveals a higher number of false negatives (50 out of 1300) compared to the base model and Model1
    • although this is still a relatively low false negative rate of 3.8%, in a medical context, each false negative can be significant, and so this might be an area to focus on for improvement.

Training and Validation Accuracy Graph: The accuracy graph demonstrates that the model trains well with high accuracy, but there is a slight divergence between training and validation accuracy, which could be an early sign of overfitting.

Specific Observations and Model Improvement Suggestions:

False Negatives: To reduce the number of false negatives further, we can try further increasing the complexity of the model or using techniques like focal loss, which gives more weight to harder-to-classify examples.

Overfitting Check: Given the slight divergence between training and validation accuracy, monitor for overfitting by possibly introducing further regularization.

Hyperparameter Tuning: Experimenting with different values for the dropout rate and the alpha parameter of the LeakyReLU activation function may yield improvements. Additionally, a learning rate schedule could be introduced to fine-tune the learning process over epochs.

Model 3: Data Augmentation¶

Think About It :

  • Can we improve the model with Image Data Augmentation?
  • References to image data augmentation can be seen below:
    • Image Augmentation for Computer Vision
    • How to Configure Image Data Augmentation in Keras?
In [ ]:
backend.clear_session() # Clearing the backend for new model

np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)
In [ ]:
# Split the training data to create a validation set
X_train, X_val, y_train, y_val = train_test_split(train_images, train_labels, test_size=0.2, random_state=42)

We will analyze data augmented in 2 different ways

Use image data generator

Model 3a¶

Augmentation Parameters for Model 3a and Normalization:

In [ ]:
# Creating an training data generator with different augmentations
train_datagen = ImageDataGenerator(
    rescale=1./255,           # Normalizing the pixel values
    shear_range=0.3,
    rotation_range=20,
    ##shear_range=0.2,          # Shear Intensity (Shear angle in counter-clockwise direction in degrees)
    ##width_shift_range=0.1,    # Fraction of total width for horizontal shift
    ##height_shift_range=0.1,   # Fraction of total height for vertical shift
    brightness_range=[0.8,1.2], # Range for picking a brightness shift value
    zoom_range=0.1,           # Range for random zoom
    vertical_flip=True        # Randomly flip inputs vertically
)

# Creating the validation data generator (normalization only, no augmentation)
val_datagen = ImageDataGenerator(rescale=1./255)

Setting up the Training and Validation Generator:

In [ ]:
# Setting up the training generator
train_generator = train_datagen.flow(
    x=X_train,
    y=y_train,
    batch_size=64,
    seed=42,
    shuffle=True
)

# Setting up the validation generator
val_generator = val_datagen.flow(
    x=X_val,
    y=y_val,
    batch_size=64,
    seed=42,
    shuffle=True
)

Visualizing the first set of Augmented images (Input for model3a)

In [ ]:
# Function to visualize images
def visualize_augmented_images(image_generator, n_images):
    # Get a batch of images
    images, labels = next(image_generator)

    # Set up the grid
    plt.figure(figsize=(10, 10))
    for i in range(n_images):
        plt.subplot(n_images // 4 + 1, 4, i + 1)
        plt.imshow(images[i])
        plt.title('Augmented Image')
        plt.axis('off')
    plt.tight_layout()
    plt.show()

# Visualize some augmented images
visualize_augmented_images(train_generator, 8)

Building Model 3a¶

In [ ]:
model3a = Sequential()

# Convolutional Block
model3a.add(Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(SIZE, SIZE, 3)))
model3a.add(BatchNormalization())
model3a.add(MaxPooling2D((2, 2)))
model3a.add(Dropout(0.2))

# Flatten and Fully Connected Layers
model3a.add(Flatten())
model3a.add(Dense(128, activation='relu'))
model3a.add(BatchNormalization())
model3a.add(Dropout(0.3))

# Output Layer for binary classification
model3a.add(Dense(1, activation='sigmoid'))  #  1 output node for binary classification

model3a.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 64, 64, 32)        896       
                                                                 
 batch_normalization (Batch  (None, 64, 64, 32)        128       
 Normalization)                                                  
                                                                 
 max_pooling2d (MaxPooling2  (None, 32, 32, 32)        0         
 D)                                                              
                                                                 
 dropout (Dropout)           (None, 32, 32, 32)        0         
                                                                 
 flatten (Flatten)           (None, 32768)             0         
                                                                 
 dense (Dense)               (None, 128)               4194432   
                                                                 
 batch_normalization_1 (Bat  (None, 128)               512       
 chNormalization)                                                
                                                                 
 dropout_1 (Dropout)         (None, 128)               0         
                                                                 
 dense_1 (Dense)             (None, 1)                 129       
                                                                 
=================================================================
Total params: 4196097 (16.01 MB)
Trainable params: 4195777 (16.01 MB)
Non-trainable params: 320 (1.25 KB)
_________________________________________________________________

Compile the Model

In [ ]:
# Compile the model
model3a.compile(optimizer=Adam(learning_rate = 0.001), loss='binary_crossentropy', metrics=['accuracy'])

Using Callbacks

In [ ]:
early_stopping = EarlyStopping(monitor='val_loss', patience=5)
model_checkpoint = ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True)

Fit and Train Model 3a

In [ ]:
# Train with the first augmented data set
history3a = model3a.fit(
    train_generator,
    callbacks=[early_stopping, model_checkpoint],
    steps_per_epoch=len(X_train) // 64,  # Batch size of 64
    epochs=20,
    validation_data=val_generator
)
Epoch 1/20
311/311 [==============================] - 53s 165ms/step - loss: 0.6135 - accuracy: 0.6914 - val_loss: 0.8522 - val_accuracy: 0.5108
Epoch 2/20
311/311 [==============================] - 49s 159ms/step - loss: 0.3870 - accuracy: 0.8381 - val_loss: 0.5153 - val_accuracy: 0.8121
Epoch 3/20
311/311 [==============================] - 49s 159ms/step - loss: 0.2739 - accuracy: 0.8990 - val_loss: 2.6805 - val_accuracy: 0.6214
Epoch 4/20
311/311 [==============================] - 49s 157ms/step - loss: 0.2596 - accuracy: 0.9096 - val_loss: 0.5436 - val_accuracy: 0.8470
Epoch 5/20
311/311 [==============================] - 49s 158ms/step - loss: 0.2510 - accuracy: 0.9184 - val_loss: 0.6963 - val_accuracy: 0.5128
Epoch 6/20
311/311 [==============================] - 49s 159ms/step - loss: 0.2263 - accuracy: 0.9264 - val_loss: 0.2010 - val_accuracy: 0.9311
Epoch 7/20
311/311 [==============================] - 48s 155ms/step - loss: 0.2290 - accuracy: 0.9258 - val_loss: 0.2963 - val_accuracy: 0.9081
Epoch 8/20
311/311 [==============================] - 53s 169ms/step - loss: 0.2113 - accuracy: 0.9294 - val_loss: 0.2145 - val_accuracy: 0.9143
Epoch 9/20
311/311 [==============================] - 50s 162ms/step - loss: 0.2194 - accuracy: 0.9282 - val_loss: 0.2024 - val_accuracy: 0.9373
Epoch 10/20
311/311 [==============================] - 51s 163ms/step - loss: 0.2106 - accuracy: 0.9303 - val_loss: 0.2726 - val_accuracy: 0.8984
Epoch 11/20
311/311 [==============================] - 52s 166ms/step - loss: 0.2089 - accuracy: 0.9296 - val_loss: 0.1663 - val_accuracy: 0.9443
Epoch 12/20
311/311 [==============================] - 50s 162ms/step - loss: 0.1949 - accuracy: 0.9336 - val_loss: 0.2008 - val_accuracy: 0.9235
Epoch 13/20
311/311 [==============================] - 51s 163ms/step - loss: 0.1838 - accuracy: 0.9380 - val_loss: 0.2264 - val_accuracy: 0.9391
Epoch 14/20
311/311 [==============================] - 51s 163ms/step - loss: 0.1819 - accuracy: 0.9378 - val_loss: 0.1513 - val_accuracy: 0.9435
Epoch 15/20
311/311 [==============================] - 51s 164ms/step - loss: 0.1762 - accuracy: 0.9418 - val_loss: 0.1578 - val_accuracy: 0.9469
Epoch 16/20
311/311 [==============================] - 52s 166ms/step - loss: 0.1699 - accuracy: 0.9414 - val_loss: 0.1667 - val_accuracy: 0.9415
Epoch 17/20
311/311 [==============================] - 53s 171ms/step - loss: 0.1731 - accuracy: 0.9390 - val_loss: 0.1515 - val_accuracy: 0.9475
Epoch 18/20
311/311 [==============================] - 51s 162ms/step - loss: 0.1784 - accuracy: 0.9397 - val_loss: 0.1619 - val_accuracy: 0.9511
Epoch 19/20
311/311 [==============================] - 51s 163ms/step - loss: 0.1647 - accuracy: 0.9432 - val_loss: 0.1762 - val_accuracy: 0.9381

Evaluating Model 3a¶

Calculating the Test Accuracy

In [ ]:
# Load the best model saved by ModelCheckpoint
Model3a = keras.models.load_model('best_model.h5')

# Evaluate model3a on test data
test_loss, test_accuracy = model3a.evaluate(test_images_normalized, test_labels)
print(f"Test Loss: {test_loss}, Test Accuracy: {test_accuracy}")
82/82 [==============================] - 1s 13ms/step - loss: 0.2452 - accuracy: 0.9238
Test Loss: 0.2451961785554886, Test Accuracy: 0.9238461256027222

Plot the train and validation accuracy

In [ ]:
#plotting the training and validation accuracy

# Extracting accuracy and validation accuracy from the history object
accuracy = history3a.history['accuracy']
val_accuracy = history3a.history['val_accuracy']

# Generating a range for epochs starting from 1 to the length of the accuracy list
epochs = range(1, len(accuracy) + 1)

# Creating a plot
plt.figure(figsize=(8, 8))

# Plotting both the training accuracy and validation accuracy
plt.plot(epochs, accuracy, 'b--', label='Training Accuracy')  # Blue dashed line for training accuracy
plt.plot(epochs, val_accuracy, 'r--', label='Validation Accuracy')  # Red dashed line for validation accuracy

# Adding labels and title for clarity
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')

# Showing the legend
plt.legend()

# Displaying the plot
plt.show()

Plotting the classification report and confusion matrix

In [ ]:
# Predictions Model 3a
predictions3a = Model3a.predict(test_images_normalized)
predictions3a = np.argmax(predictions3a, axis=1)
test_labels_decoded = np.argmax(test_labels_encoded, axis=1)

#Print results using a classification report
print(classification_report(test_labels_decoded, predictions3a))

# Plotting the Confusion Matrix using confusion matrix() function which is also predefined tensorflow module
confusion_matrix = tf.math.confusion_matrix(test_labels_decoded, predictions3a)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()
82/82 [==============================] - 1s 14ms/step
              precision    recall  f1-score   support

           0       0.50      1.00      0.67      1300
           1       0.00      0.00      0.00      1300

    accuracy                           0.50      2600
   macro avg       0.25      0.50      0.33      2600
weighted avg       0.25      0.50      0.33      2600

Model 3b¶

In [ ]:
backend.clear_session() # Clearing the backend for new model

np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)

Trying different Augmentation Parameters (Model3b)

In [ ]:
# Create an instance of ImageDataGenerator with desired augmentations
train_datagen1 = ImageDataGenerator(
    rescale=1./255,           # Normalizing the pixel values
    rotation_range=20,       # Degree range for random rotations
    ##width_shift_range=0.2,   # Range (as a fraction of total width) for random horizontal shifts
    ##height_shift_range=0.2,  # Range (as a fraction of total height) for random vertical shifts
    shear_range=0.2,         # Shearing intensity
    zoom_range=0.02,          # Range for random zoom
    horizontal_flip=True,    # Randomly flip inputs horizontally
    fill_mode='nearest'
)

# Creating the validation data generator (normalization only, no augmentation)
val_datagen1 = ImageDataGenerator(rescale=1./255)

Setting up the Training and Validation Generator:

In [ ]:
# Setting up the alternative training generator
train_generator1 = train_datagen1.flow(
    x=X_train,
    y=y_train,
    batch_size=64,
    seed=42,
    shuffle=True
)

# Setting up the alternative validation generator
val_generator1 = val_datagen1.flow(
    x=X_val,
    y=y_val,
    batch_size=64,
    seed=42,
    shuffle=True
)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-22-a94b8ceb5884> in <cell line: 2>()
      1 # Setting up the alternative training generator
----> 2 train_generator1 = train_datagen1.flow(
      3     x=X_train,
      4     y=y_train,
      5     batch_size=64,

NameError: name 'train_datagen1' is not defined

Visualizing the second set of Augmented images (input for model3b)

In [ ]:
# Function to visualize images
def visualize_augmented_images(image_generator1, n_images):
    # Get a batch of images
    images, labels = next(image_generator1)

    # Set up the grid
    plt.figure(figsize=(10, 10))
    for i in range(n_images):
        plt.subplot(n_images // 4 + 1, 4, i + 1)
        plt.imshow(images[i])
        plt.title('Augmented Image')
        plt.axis('off')
    plt.tight_layout()
    plt.show()

# Visualize some augmented images
visualize_augmented_images(train_generator1, 8)

Building Model 3b¶

In [ ]:
model3b = Sequential()

model3b.add(Conv2D(32, (3, 3), activation='relu', padding="same", input_shape=(SIZE, SIZE, 3)))
model3b.add(BatchNormalization())
model3b.add(MaxPooling2D((2, 2)))
model3b.add(Dropout(0.2))

model3b.add(Flatten())
model3b.add(Dense(128, activation='relu'))
model3b.add(BatchNormalization())
model3b.add(Dropout(0.3))
model3b.add(Dense(1, activation='softmax'))

model3b.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 64, 64, 32)        896       
                                                                 
 batch_normalization (Batch  (None, 64, 64, 32)        128       
 Normalization)                                                  
                                                                 
 max_pooling2d (MaxPooling2  (None, 32, 32, 32)        0         
 D)                                                              
                                                                 
 dropout (Dropout)           (None, 32, 32, 32)        0         
                                                                 
 flatten (Flatten)           (None, 32768)             0         
                                                                 
 dense (Dense)               (None, 128)               4194432   
                                                                 
 batch_normalization_1 (Bat  (None, 128)               512       
 chNormalization)                                                
                                                                 
 dropout_1 (Dropout)         (None, 128)               0         
                                                                 
 dense_1 (Dense)             (None, 1)                 129       
                                                                 
=================================================================
Total params: 4196097 (16.01 MB)
Trainable params: 4195777 (16.01 MB)
Non-trainable params: 320 (1.25 KB)
_________________________________________________________________
In [ ]:
model3b = Sequential()

model3b.add(Conv2D(32, (3, 3), activation='relu', padding="same", input_shape=(SIZE, SIZE, 3)))
model3b.add(BatchNormalization())
model3b.add(MaxPooling2D((2, 2)))
model3b.add(Dropout(0.2))

model3b.add(Flatten())
model3b.add(Dense(128, activation='relu'))
model3b.add(BatchNormalization())
model3b.add(Dropout(0.3))
model3b.add(Dense(1, activation='sigmoid'))

model3b.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_1 (Conv2D)           (None, 64, 64, 32)        896       
                                                                 
 batch_normalization_2 (Bat  (None, 64, 64, 32)        128       
 chNormalization)                                                
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 32, 32, 32)        0         
 g2D)                                                            
                                                                 
 dropout_2 (Dropout)         (None, 32, 32, 32)        0         
                                                                 
 flatten_1 (Flatten)         (None, 32768)             0         
                                                                 
 dense_2 (Dense)             (None, 128)               4194432   
                                                                 
 batch_normalization_3 (Bat  (None, 128)               512       
 chNormalization)                                                
                                                                 
 dropout_3 (Dropout)         (None, 128)               0         
                                                                 
 dense_3 (Dense)             (None, 1)                 129       
                                                                 
=================================================================
Total params: 4196097 (16.01 MB)
Trainable params: 4195777 (16.01 MB)
Non-trainable params: 320 (1.25 KB)
_________________________________________________________________
In [ ]:
# Compile the model
model3b.compile(optimizer=Adam(learning_rate = 0.0001), loss='binary_crossentropy', metrics=['accuracy'])
In [ ]:
early_stopping = EarlyStopping(monitor='val_loss', patience=5)
model_checkpoint = ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True)

Fitting and Training Model 3b

In [ ]:
# Train with the second augmented data set
history3b = model3b.fit(
    train_generator1,
    callbacks=[early_stopping, model_checkpoint],
    steps_per_epoch=len(X_train) // 64,  # Batch size of 64
    epochs=20,
    validation_data=val_generator1,
    validation_steps=len(X_val) // 64,  # Batch size of 64
    verbose = 1
)
Epoch 1/20
311/311 [==============================] - 46s 144ms/step - loss: 0.6800 - accuracy: 0.6487 - val_loss: 0.7113 - val_accuracy: 0.5212
Epoch 2/20
311/311 [==============================] - 45s 144ms/step - loss: 0.5847 - accuracy: 0.7054 - val_loss: 0.7637 - val_accuracy: 0.5889
Epoch 3/20
311/311 [==============================] - 45s 144ms/step - loss: 0.5232 - accuracy: 0.7501 - val_loss: 0.6099 - val_accuracy: 0.6929
Epoch 4/20
311/311 [==============================] - 44s 143ms/step - loss: 0.4726 - accuracy: 0.7813 - val_loss: 0.6073 - val_accuracy: 0.6999
Epoch 5/20
311/311 [==============================] - 45s 143ms/step - loss: 0.4334 - accuracy: 0.8083 - val_loss: 0.4523 - val_accuracy: 0.7923
Epoch 6/20
311/311 [==============================] - 46s 149ms/step - loss: 0.4026 - accuracy: 0.8182 - val_loss: 0.3723 - val_accuracy: 0.8387
Epoch 7/20
311/311 [==============================] - 45s 145ms/step - loss: 0.3789 - accuracy: 0.8316 - val_loss: 0.6235 - val_accuracy: 0.7484
Epoch 8/20
311/311 [==============================] - 45s 145ms/step - loss: 0.3605 - accuracy: 0.8455 - val_loss: 0.4317 - val_accuracy: 0.7835
Epoch 9/20
311/311 [==============================] - 46s 147ms/step - loss: 0.3369 - accuracy: 0.8572 - val_loss: 0.3084 - val_accuracy: 0.8706
Epoch 10/20
311/311 [==============================] - 44s 143ms/step - loss: 0.3256 - accuracy: 0.8630 - val_loss: 0.3631 - val_accuracy: 0.8375
Epoch 11/20
311/311 [==============================] - 45s 143ms/step - loss: 0.3065 - accuracy: 0.8706 - val_loss: 0.4242 - val_accuracy: 0.8175
Epoch 12/20
311/311 [==============================] - 45s 143ms/step - loss: 0.2940 - accuracy: 0.8800 - val_loss: 1.4045 - val_accuracy: 0.5234
Epoch 13/20
311/311 [==============================] - 45s 144ms/step - loss: 0.2876 - accuracy: 0.8851 - val_loss: 0.4250 - val_accuracy: 0.8211
Epoch 14/20
311/311 [==============================] - 44s 142ms/step - loss: 0.2863 - accuracy: 0.8803 - val_loss: 0.4701 - val_accuracy: 0.7558
In [ ]:
# Train with the second augmented data set
history3b = model3b.fit(
    train_generator1,
    callbacks=[early_stopping, model_checkpoint],
    batch_size=64,
    epochs=20,
    validation_data=val_generator1,
    verbose = 1
)
Epoch 1/20
312/312 [==============================] - 44s 142ms/step - loss: 0.2730 - accuracy: 0.8894 - val_loss: 0.3333 - val_accuracy: 0.8646
Epoch 2/20
312/312 [==============================] - 46s 146ms/step - loss: 0.2618 - accuracy: 0.8954 - val_loss: 0.2727 - val_accuracy: 0.8958
Epoch 3/20
312/312 [==============================] - 45s 144ms/step - loss: 0.2560 - accuracy: 0.8960 - val_loss: 0.2775 - val_accuracy: 0.8816
Epoch 4/20
312/312 [==============================] - 45s 144ms/step - loss: 0.2514 - accuracy: 0.8994 - val_loss: 0.3483 - val_accuracy: 0.8458
Epoch 5/20
312/312 [==============================] - 45s 144ms/step - loss: 0.2482 - accuracy: 0.9025 - val_loss: 0.5643 - val_accuracy: 0.6663
Epoch 6/20
312/312 [==============================] - 47s 149ms/step - loss: 0.2450 - accuracy: 0.9017 - val_loss: 0.3374 - val_accuracy: 0.8462
Epoch 7/20
312/312 [==============================] - 45s 143ms/step - loss: 0.2367 - accuracy: 0.9068 - val_loss: 0.2573 - val_accuracy: 0.8998
Epoch 8/20
312/312 [==============================] - 45s 143ms/step - loss: 0.2282 - accuracy: 0.9098 - val_loss: 1.1893 - val_accuracy: 0.5363
Epoch 9/20
312/312 [==============================] - 45s 144ms/step - loss: 0.2281 - accuracy: 0.9121 - val_loss: 0.4334 - val_accuracy: 0.8355
Epoch 10/20
312/312 [==============================] - 45s 145ms/step - loss: 0.2237 - accuracy: 0.9143 - val_loss: 0.3218 - val_accuracy: 0.8540
Epoch 11/20
312/312 [==============================] - 44s 142ms/step - loss: 0.2170 - accuracy: 0.9139 - val_loss: 0.4781 - val_accuracy: 0.8353
Epoch 12/20
312/312 [==============================] - 45s 143ms/step - loss: 0.2218 - accuracy: 0.9142 - val_loss: 0.2977 - val_accuracy: 0.8708

Evaluating Model 3b¶

Calculating the Test Accuracy

In [ ]:
# Load the best model saved by ModelCheckpoint
Model3b = keras.models.load_model('best_model.h5')

# Evaluate model3a on test data
test_loss, test_accuracy = model3b.evaluate(test_images_normalized, test_labels)
print(f"Test Loss: {test_loss}, Test Accuracy: {test_accuracy}")
82/82 [==============================] - 1s 14ms/step - loss: 0.3528 - accuracy: 0.8335
Test Loss: 0.35281237959861755, Test Accuracy: 0.8334615230560303

Plotting the classification report and confusion matrix

In [ ]:
# Predictions Model 3b
predictions = Model3b.predict(test_images_normalized)
predictions = np.argmax(predictions, axis=1) # Convert one-hot to index
test_labels_decoded = np.argmax(test_labels_encoded, axis=1) # Convert one-hot to index

#Print results using a classification report
print(classification_report(test_labels_decoded, predictions))

# Plotting the Confusion Matrix using confusion matrix() function which is also predefined tensorflow module
confusion_matrix = tf.math.confusion_matrix(test_labels_decoded, predictions)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()
82/82 [==============================] - 1s 15ms/step
              precision    recall  f1-score   support

           0       0.50      1.00      0.67      1300
           1       0.00      0.00      0.00      1300

    accuracy                           0.50      2600
   macro avg       0.25      0.50      0.33      2600
weighted avg       0.25      0.50      0.33      2600

Plot the train and validation accuracy

In [ ]:
#plotting the training and validation accuracy

# Extracting accuracy and validation accuracy from the history object
accuracy = history3b.history['accuracy']
val_accuracy = history3b.history['val_accuracy']

# Generating a range for epochs starting from 1 to the length of the accuracy list
epochs = range(1, len(accuracy) + 1)

# Creating a plot
plt.figure(figsize=(8, 8))

# Plotting both the training accuracy and validation accuracy
plt.plot(epochs, accuracy, 'b--', label='Training Accuracy')  # Blue dashed line for training accuracy
plt.plot(epochs, val_accuracy, 'r--', label='Validation Accuracy')  # Red dashed line for validation accuracy

# Adding labels and title for clarity
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')

# Showing the legend
plt.legend()

# Displaying the plot
plt.show()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-26-31ae517fdfba> in <cell line: 4>()
      2 
      3 # Extracting accuracy and validation accuracy from the history object
----> 4 accuracy = history3b.history['accuracy']
      5 val_accuracy = history3b.history['val_accuracy']
      6 

NameError: name 'history3b' is not defined

Why did we want to try Augmentation?

  • Since false negatives can be particularly costly in a medical context, ensuring the data is representative of all the variations of parasitized cells is important
  • these models (Model3a and Model3b) introduce data augmentated images to the training and validation datasets
    • In this way, the use of data augmentation aims to introduce such variability into the dataset to potentially improving the robustness and generalization of the model.

Model 3a analysis¶

Model Architecture:

  • "Model3a" has a simpler architecture compared to the previous models, which might indicate a less complex feature extraction
  • It consists of a single convolutional block with batch normalization and dropout followed by two dense layers, with the final layer using a sigmoid activation function, which is suitable for binary classification tasks.

Training Performance:

  • The training process shows increasing accuracy over epochs
    • training accuracy improves significantly over epochs, while the validation accuracy has wider variations and does not improve as consistently.
  • note: validation accuracy has an increasing trend but it exhibits fairly extreme fluctuation
    • such fluctuation might indicate the model's struggle to generalize the augmented data features effectively.

Test Performance:

  • The test accuracy is lower than the previous models at approximately 92.38%. This may be due to the model's simplicity and the use of augmented data, which can sometimes introduce additional noise into the training process.

Classification Report and Confusion Matrix:

  • The classification report is misleading due to the output indicating a perfect classification for one class and complete misclassification for the other. perhaps there is an error?
  • The confusion matrix also indicates that the model predicted only one class for all test instances, which is a clear sign of model failure, likely due to incorrect handling of the output layer or the data. Training and Validation Accuracy Graph: The accuracy graph shows a starting accuracy of around 54% which suggests the model was learning from a nearly random state. However, it doesn't reach the same level of accuracy as the training set, indicating a discrepancy in learning between the training and validation sets.

Now, let us try to use a pretrained model like VGG16 and check how it performs on our data.

In [ ]:
backend.clear_session() # Clearing the backend for new model

np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)

Pre-trained model (VGG16)¶

  • Import VGG16 network upto any layer you choose
  • Add Fully Connected Layers on top of it
In [ ]:
#test
# Load VGG16 model up to the third block of layers
VG16_model = VGG16(weights='imagenet', include_top=False, input_shape=(SIZE, SIZE, 3))

# Freeze the layers of the base model to preserve learned features, prevent overfitting, and improve training efficiency
for layer in VG16_model.layers[:9]:  # Freezing the first 9 layers (up to the third block). by freezing the first 9 layers,
    layer.trainable = False

#Add Fully Connected Layers: this block of code is essentially creating a
  #transfer learning model by leveraging the "transfer" of knowledge from a pre-trained model (VGG16)

x = Flatten()(VG16_model.output) #utput of the pre-trained VGG16 model and flattens it

# Fully connected layers
x = Dense(128, activation='relu')(x)
x = Dropout(0.1)(x) #regularization purposes, to prevent overfitting to the training data by randomly setting a fraction of input units to 0 at each update during training time.
x = Dense(64, activation='relu')(x)
x = Dropout(0.2)(x)

# Output layer
predictions = Dense(1, activation='sigmoid')(x)

# This is the model we will train
model_vgg16 = Model(inputs=VG16_model.input, outputs=predictions)

# Summary of the model
model_vgg16.summary()
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 64, 64, 3)]       0         
                                                                 
 block1_conv1 (Conv2D)       (None, 64, 64, 64)        1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 64, 64, 64)        36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 32, 32, 64)        0         
                                                                 
 block2_conv1 (Conv2D)       (None, 32, 32, 128)       73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 32, 32, 128)       147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 16, 16, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 16, 16, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 16, 16, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 16, 16, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 8, 8, 256)         0         
                                                                 
 block4_conv1 (Conv2D)       (None, 8, 8, 512)         1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 8, 8, 512)         2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 8, 8, 512)         2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 4, 4, 512)         0         
                                                                 
 block5_conv1 (Conv2D)       (None, 4, 4, 512)         2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 4, 4, 512)         2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 4, 4, 512)         2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 2, 2, 512)         0         
                                                                 
 flatten (Flatten)           (None, 2048)              0         
                                                                 
 dense (Dense)               (None, 128)               262272    
                                                                 
 dropout (Dropout)           (None, 128)               0         
                                                                 
 dense_1 (Dense)             (None, 64)                8256      
                                                                 
 dropout_1 (Dropout)         (None, 64)                0         
                                                                 
 dense_2 (Dense)             (None, 1)                 65        
                                                                 
=================================================================
Total params: 14985281 (57.16 MB)
Trainable params: 13839873 (52.79 MB)
Non-trainable params: 1145408 (4.37 MB)
_________________________________________________________________

Note on Freezing Layers:

  • the first 9 layers of VGG16 are frozen, meaning their weights will not be updated during training. This is because these layers have already learned very general features from a large and diverse dataset (ImageNet).
  • therefore freezing the first layers of a pre-trained model when used for transfer learning serves the following purposes:
    • Preserve Learned Features: The early layers of a convolutional neural network usually learn to detect low-level features like edges and textures, which are general-purpose features applicable to a wide range of image recognition tasks. Freezing these layers preserves this valuable knowledge.
    • Prevent Overfitting: Fine-tuning too many parameters on a small dataset can lead to overfitting. By freezing the first few layers, you reduce the number of trainable parameters, which helps to avoid overfitting to the training data.
    • Improve Training Efficiency: Training only the unfrozen, higher-level layers is computationally less expensive. Since the early layers are not being updated, you save on the computational cost required to propagate gradients through those layers during backpropagation.
    • Improving Stability: The early layers have already been optimized on a large and diverse dataset, and altering these optimized features with a small dataset may lead to instability in learning, and the model might not converge to a good solution.

Compiling the model

In [ ]:
model_vgg16.compile(optimizer=Adam(learning_rate=0.0001), loss='binary_crossentropy', metrics=['accuracy'])

using callbacks

In [ ]:
early_stopping = EarlyStopping(monitor='val_loss', patience=5)
model_checkpoint = ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True)

Fit and Train the model

In [ ]:
#test
# Train with the second augmented data set
history_vgg16 = model_vgg16.fit(
    train_generator,
    callbacks=[early_stopping, model_checkpoint],
    steps_per_epoch=len(X_train) // 64,  # Batch size of 64
    epochs=20,
    validation_data=val_generator,
    validation_steps=len(X_val) // 64,  # Batch size of 64
    verbose = 1
)
Epoch 1/20
311/311 [==============================] - 614s 2s/step - loss: 0.1213 - accuracy: 0.9556 - val_loss: 0.0642 - val_accuracy: 0.9760
Epoch 2/20
311/311 [==============================] - 611s 2s/step - loss: 0.0672 - accuracy: 0.9784 - val_loss: 0.0517 - val_accuracy: 0.9800
Epoch 3/20
311/311 [==============================] - 609s 2s/step - loss: 0.0612 - accuracy: 0.9791 - val_loss: 0.0471 - val_accuracy: 0.9826
Epoch 4/20
311/311 [==============================] - 614s 2s/step - loss: 0.0589 - accuracy: 0.9803 - val_loss: 0.0415 - val_accuracy: 0.9830
Epoch 5/20
311/311 [==============================] - 613s 2s/step - loss: 0.0548 - accuracy: 0.9804 - val_loss: 0.0502 - val_accuracy: 0.9806
Epoch 6/20
311/311 [==============================] - 611s 2s/step - loss: 0.0540 - accuracy: 0.9812 - val_loss: 0.0444 - val_accuracy: 0.9826
Epoch 7/20
311/311 [==============================] - 608s 2s/step - loss: 0.0506 - accuracy: 0.9820 - val_loss: 0.0456 - val_accuracy: 0.9824
Epoch 8/20
311/311 [==============================] - 609s 2s/step - loss: 0.0506 - accuracy: 0.9830 - val_loss: 0.0429 - val_accuracy: 0.9824
Epoch 9/20
311/311 [==============================] - 605s 2s/step - loss: 0.0467 - accuracy: 0.9825 - val_loss: 0.0427 - val_accuracy: 0.9834

Evaluating the model

In [ ]:
# Load the best model saved by ModelCheckpoint
model_vgg16 = keras.models.load_model('best_model.h5')

# Evaluate model3a on test data
test_loss, test_accuracy = model_vgg16.evaluate(test_images_normalized, test_labels)
print(f"Test Loss: {test_loss}, Test Accuracy: {test_accuracy}")
82/82 [==============================] - 30s 369ms/step - loss: 0.0486 - accuracy: 0.9796
Test Loss: 0.04861888661980629, Test Accuracy: 0.9796153903007507

Plot the train and validation accuracy

In [ ]:
#plotting the training and validation accuracy

# Extracting accuracy and validation accuracy from the history object
accuracy = history_vgg16.history['accuracy']
val_accuracy = history_vgg16.history['val_accuracy']

# Generating a range for epochs starting from 1 to the length of the accuracy list
epochs = range(1, len(accuracy) + 1)

# Creating a plot
plt.figure(figsize=(8, 8))

# Plotting both the training accuracy and validation accuracy
plt.plot(epochs, accuracy, 'b--', label='Training Accuracy')  # Blue dashed line for training accuracy
plt.plot(epochs, val_accuracy, 'r--', label='Validation Accuracy')  # Red dashed line for validation accuracy

# Adding labels and title for clarity
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')

# Showing the legend
plt.legend()

# Displaying the plot
plt.show()

Plotting the classification report and confusion matrix

In [ ]:
# Predictions Model VGG16
predictionsvgg16 = model_vgg16.predict(test_images_normalized)
predictionsvgg16 = np.argmax(predictionsvgg16, axis=1) # Convert one-hot to index
test_labels_decoded = np.argmax(test_labels_encoded, axis=1) # Convert one-hot to index

#Print results using a classification report
print(classification_report(test_labels_decoded, predictionsvgg16))

# Plotting the Confusion Matrix using confusion matrix() function which is also predefined tensorflow module
confusion_matrix = tf.math.confusion_matrix(test_labels_decoded, predictionsvgg16)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()
82/82 [==============================] - 30s 358ms/step
              precision    recall  f1-score   support

           0       0.50      1.00      0.67      1300
           1       0.00      0.00      0.00      1300

    accuracy                           0.50      2600
   macro avg       0.25      0.50      0.33      2600
weighted avg       0.25      0.50      0.33      2600

VGG16 Model Insights and Observations:¶

Model Architecture:

  • The model utilizes the VGG16 architecture with weights preloaded from ImageNet. This is a common practice for transfer learning, which leverages pre-trained networks to improve learning efficiency and accuracy in a new task with limited data. The first 9 layers are frozen to utilize the learned features from ImageNet without modifying them during the initial training, which helps in transferring the knowledge and not overfitting quickly.

Training

  • this model has a very large number of trainable parametrs (over 57 million), which is why the training time is so long
  • therefore, this solution is computationally inefficient and may not be practical in a low-resource environment
  • so although test accuracy for this model seems to be be slightly higher than the base model, this improvement is too minor to justify adopting such a complex model that requires so much more computational power and memory than the base model.
  • this model is powerful but it's structure and requirements are unsustainable in low-resource environments

Suggested Improvements:

  • perhaps unfreezing some cells may allow for the model to perform better on the validation set. This will allow for more feature adaptation to the specific task of identifying parasitic cells.
  • Experiment with the learning rate. optimizing this rate can lead to better convergence.

Improvements that can be done:

  • Can the model performance be improved using other pre-trained models or different CNN architecture?
  • You can try to build a model using these HSV images and compare them with your other models.

Proposed Model: The "Base Model"¶

Base Model Specifications:

  • The final model is a CNN with strategic layering of convolutional layers activated by ReLU functions, pooling layers to distill features, dropout layers to mitigate overfitting, and dense layers for the final classification. Structured sequentially, the model is specialized for 2D image data processing, targeting the detection of malaria parasites in blood cells with remarkable accuracy.

Base Model Accuracy Rates

  • the base model maintains the following performance metrics:
    • uninfected cells:
      • precision -- 0.98
      • recall -- 0.99
      • f1-score -- 0.99
    • parasitized cells:
      • precision -- 0.98
      • recall -- 0.98
      • f1-score -- 0.99

Why is the "base Model" the 'best' model for Maralia Diagnosis?¶

  • superier ability to accurately classify parasitized and uninfected cells
    • the model demonstrates reliability, with an overall prediction accuracy measure of 98%
  • Highest Recall Rate: In a medical context, ethical considerations call for the model to focus on minimizing false negatives, because this outcome bears the largest consequence. For the malaria detection problem, minimizing the false-negative rate is vital to ensure that infected patients receive treatment. Therefore, the selected model must have a high sensitivity (recall) to parasitized cells.
    • the base model misclassified infected cells as 'healthy' at a rate of approximately 1% (27/2600). this rate is lower than any other model within the scope of this project (Although it is improtant to note that there is still room for improvement!)
  • Good Balance Between Precision and Recall: the base model also has a very high precision rate, indicating that the rate of false positive diagnosis is also low
    • such high precision is good because we also want to limit the number of false positives, which could lead to unnecessary treatment or alarm and related costs.
  • the base model has good generalization Ability, as indicated by its ability to perform well on unseen data with an test accuracy measure of approximately 98%
    • The validation accuracy is below the training accuracy from around epoch 9 onward, indicating that the model does not seem to be overfitting on the training data
  • the model is fairly simple and computationally efficient
    • computational efficiency is an important consideration given that the model will most likely be deployed in a low-resource environment

Comparatively, why is the Base Model the 'best' solution to the problem?¶

- The Base Model is superior in comparison to Model1 and Model2:

  • Although the Base Model has more trainable parameters than Model1 and Model2, it achieves higher recall and accuracy, making it a better choice for malaria diagnosis. The slight increase in computational complexity (more trainable parameters) is justified by the potential for more accurate and life-saving diagnoses.
  • Accuracy and Recall Importance: In medical diagnostics, the stakes are significantly higher. Accuracy and recall (the ability to correctly identify true positive cases) are paramount. A model with higher recall and accuracy, even if it's slightly more complex, is preferable because the cost of a false negative (i.e., failing to identify a case of malaria) can be life-threatening. In this domain, the emphasis is on the quality and reliability of the diagnosis over computational convenience.

- The Base model is superior to the Augmented and VGG16 models:

  • the accuracy of these models actually show a decline in accuracy compared to the base model, even though these models have become significantly more complex
    • Models with Augmented Images: While data augmentation can sometimes enhance a model's ability to generalize by simulating a variety of imaging conditions, it doesn't translate to higher accuracy or recall. Because the augmented model does not significantly improve these metrics, the added complexity is not justified, especially in resource-limited settings.
    • VGG16 Model, being a deep and complex architecture, contains a significantly higher number of trainable parameters (13,839,938 trainable parameters). While it might offer some improvement in accuracy, the gain may not be substantial enough to offset the considerable increase in computational demands. In settings like developing countries or remote areas, where resources are constrained, such models may not be practical due to their high computational requirements.

Base Model is therfore the 'best' model

  • With fewer trainable parameters, the base model is less prone to overfitting compared to these models
    • Simpler models, by virtue of their limited complexity, often generalize better and thus their predictions can be more reliable and interpretable.
  • The Base Model strikes a balance between computational demand and diagnostic performance.
    • In regions with limited access to advanced computing resources, a model that is both effective and efficiently operable on less advanced hardware is more sustainable and broadly accessible.

In summary, the choice of the Base Model for malaria diagnosis is guided by its superior accuracy and recall, making it highly reliable for medical use. While it has more trainable parameters than some simpler models, it remains computationally feasible and does not excessively burden the available resources, making it a suitable choice for deployment in diverse settings, including those with limited computational capabilities.

Recommendations for Implimentation¶

a. Key recommendations to impliment the solution (i.e. the "Base Model"):¶

All the below efforts aim to maintain and enhance the accuracy and reliability of the malaria diagnostic tool in the face of changing real-world conditions.

  • use a larger and more diverse dataset to validate the model, making it more robust
  • Implementing a user-friendly interface for healthcare professionals to interact with the model
    • user-friendly interfaces enable professionals to adopt the tool easily, without needing extensive technical knowlwdge
      • A well-designed interface can reduce the learning curve and training requirements for new users, which can lead to quicker adoption and less resistance from staff. It can also provide built-in guidance and support, thereby reducing errors and improving diagnostic accuracy.
    • simplicity: tool accessable to all levels of healthcare providers
    • the adoption of this automated tool takes time off of the healthcare professional's time, which can then be better spent focused on the patient
    • Efficiency: An interface that is intuitive to navigate can save time, reducing the gap between image acquisition and diagnosis
      • this is particularly helpful in high-pressure environments where time is of the essence, such as during disease outbreaks or in high-volume testing centers.
        • These conditions are not uncommon in low resource hospitals in 'malaria areas', which are predominantly located in underdeveloped regions characterized by high malaria admittance volumes and inadequate time and resources
    • Lastly, a user-friendly interfaces provide a feedback mechanism, which is valuable for continuous improvement of the tool.
  • Establish a protocol for periodic re-training of the model to adapt to new data
    • Why is this protocol enforcement crucial?
      • as the disease evolves(e.g. due to mutations of the malaria parasite), new types of data may come about
      • if the model is not retrained on this new data, its performance on current, unseen data with deteriorate
      • The characteristics of the malaria parasite, as well as blood cell images, may change due to several factors such as changes in staining techniques, evolution of the parasite, or even the introduction of new microscopy equipment.
      • therefore, re-training the model with new variety of data reflecting these changes improves robustness of the tool.
      • Note: as the model is exposed to more and varied data over time, re-training can help improve its predictive performance (This is in line with the principle of continuous learning in machine learning, where models evolve and improve with more data).
    • In practice, we would have to either hire a team or create a system that performs this function
      • This team would also monitor the model's performance over time, retrain the models on the new data periodically, and deploy updated models

b. Key actionables for stakeholders:¶

  • Healthcare Staff Training: for successful implimentation of the diagnostic tool, users must be proficient in using it
    • proficiency of its use involves understanding how to input data into the system, interpret the results, and integrate findings to support the patient's diagnosis
    • Training ensures that healthcare professionals can leverage the tool effectively, enhancing diagnostic accuracy and patient outcomes
  • Develop a pipeline for continuous data collection and model updating
    • In the dynamic field of healthcare, particularly in infectious diseases like malaria, pathogens can evolve, and diagnostic parameters can change.
    • Continuous data collection ensures that the AI model is exposed to the latest strains of the parasite and adapts to any emerging patterns.
    • Regular model updates are necessary to maintain the tool's accuracy and relevance, ensuring it remains a reliable asset in malaria diagnosis.
  • Secure necessary regulatory approvals for clinical use of AI tools.
    • before a medical AI tool can be deployed in a clinical setting, it must comply with regulatory standards to ensure patient safety and data security.
    • Obtaining these approvals is a critical step in the process, as it legitimizes the tool's use in healthcare settings and assures stakeholders of its safety and efficacy.

c. Expected benefits and costs:¶

  • Benefits:
    • reduced time for diagnosis
    • potential for earlier treatment
    • decreased workload for laboratory technicians.
  • Costs:
    • initial development and implementation
    • ongoing maintenance
    • additional training for medical staff.
  • Therefore assumptions might include improved patient outcomes due to faster diagnosis and treatment, and long-term cost savings from reduced manual labor.

/

Hypothetical Dollar Cost-Benefit Analysis:

Cost Assumptions and Estimations:

  • Development and Implementation: Assuming a development cost of 500,000 dollars for the initial setup, software creation, and integration into existing systems.
  • Maintenance: Ongoing annual maintenance at 15% of initial development costs would be 75,000 dollars per year
  • Training: Training for medical staff might cost around 50,000 dollars initially and 10,000 dollars annually for new staff or refresher courses

Benefit Assumptions and estimations:

  • Increased Efficiency: Automation can process hundreds of slides per day.
    • If we assume each diagnosis saves 30 minutes of a technician's time, for an average hourly wage of 30 dollars, the savings would be 15 dollars per diagnosis.
    • At 100 diagnoses per day, this results in 1,500 dollars in daily savings, or approximately 547,500 dollars annually.
  • Improved Outcomes:
    • Faster diagnosis leads to quicker treatment, which can reduce hospital stays.
    • If early diagnosis reduces one day of hospitalization on average, and the hospital cost is 2,000 dollars per day
      • for 100 patients annually, this results in 200,000 dollars saving per year.
  • Reduced Labor Costs:
    • If the system reduces the need for one full-time technician position, saving an annual salary of 60,000 dollars, the 10-year saving would be 600,000 dollars

10-Year Profit Estimation Calculation: Over 10 years, assuming the savings and costs remain constant, and without adjusting for inflation:

. Total Costs = Development (500,000) + Maintenance (10 years 75,000) + Training (50,000 + 9 years 10,000) = 1,250,000

. Total Benefits = Efficiency Savings (10 years 547,500) + Hospital Stay Reductions (10 years 200,000) + Labor Savings (10 years * 60,000) = 8,075,000

The estimated net profit would be Total Benefits - Total Costs = 8,075,000 - 1,250,000 = 6,825,000 dollars over 10 years.

Rationale Note: These estimates are based on researched facts about the impact of automation and AI in healthcare, such as increased diagnostic speeds, cost savings from labor reduction, and improved patient outcomes due to earlier intervention.

d. Potential risks or challenges:

Note: the accuracy of the model in real-world settings may differ from test conditions.

Business Risks:

  • Accuracy and Reliability: AI models may make incorrect diagnoses leading to potential health risks for patients and liability for healthcare providers
    • The model's dependency on high-quality image data and potential bias toward the data it was trained on are challenges that require ongoing attention
      • Additional data from diverse sources could further improve robustness and generalizability
  • Integration: the system might not integrate well with existing healthcare IT systems. This might cuase disruptions and incur additional costs
  • Slow adoption:resistance from healthcare professionals due to trust issues with AI systems, leading to underutilization of the investment.
  • Data Security: risk of data breaches, which could compromise patient privacy and result in legal and financial repercussions
  • Regulatory Approval: Difficulty in obtaining regulatory approval, which could delay implementation and increase costs.
  • Obsolescence: Rapid advancements in AI could render the system obsolete, requiring further investment sooner than anticipated

e. Further analysis and associated problems:

  • we can stInvestigating the model's performance across different demographic groups to ensure no bias.
  • Exploring the integration of the model with mobile devices for use in remote locations.
  • Assessing the model's interpretability by healthcare professionals for trust and reliability.

.

Concluding Remark:

This model, with continuous improvement and integration into healthcare systems, has the potential to significantly enhance malaria diagnosis, treatment, and management, particularly in regions where the disease is prevalent and resources are limited.

Potential Improvements to the final model:¶

Train the base model with different splits:

  • To further validate the effectiveness of the split, you can repeat the process with different splits of the validation data and observe if the results are consistent.

Use Cross-Validation:

  • Implement cross-validation to ensure the model's performance is consistent across different subsets of the data.
  • For a more thorough assessment, consider using k-fold cross-validation.
    • This technique involves dividing the data into k subsets and training the model k times, each time using a different subset as the validation set and the remaining data as the training set.

Hyperparameter Tuning:

  • Use grid search or random search to fine-tune the hyperparameters of the model.

Explore more complex architectures or regularization techniques to enhance the model's ability to learn from the data without overfitting.

  • for example, use of more advanced architectures, like ResNet or Inception could be explored. such archtectures might capture features more effectively.
  • experiment with different Regularization Techniques, for example, employ dropout, L1/L2 regularization to prevent overfitting

Ensemble Methods to combine the predictions from different models to improve performance and stability.

For additional tests and preprocessing steps:¶

Outlier Detection:

  • Remove or correct outliers in the data that could lead to overfitting.

Domain Expert Involvement:

  • Collaborate with healthcare professionals to understand which features are most indicative of malaria to focus the model's learning.

External Validation:

  • Test the model on an external dataset to evaluate its generalizability to other settings and populations.